Revisiting Blank Nodes in RDF to Avoid the Semantic Mismatch with SPARQL
Abstract
When RDF was released as a W3C Recommendation in 1999, the natural problem of querying RDF data arose with it. Since then, several designs and implementations of RDF query languages have been proposed (see [9] and [7] for detailed comparisons of RDF query languages). In 2004, the RDF Data Access Working Group released a first public working draft of a query language for RDF, called SPARQL [13]. SPARQL was then rapidly adopted as the standard for querying Semantic Web data, and it became a W3C Recommendation in January 2008.

Although SPARQL is the standard query language for RDF, it was designed to remain efficient on current database technology. As a consequence, the current definition of its semantics does not consider the combined treatment of two distinctive features of RDF graphs, namely the semantics of blank nodes and the RDFS vocabulary recommended by the W3C in the definition of RDF [10]. In fact, for some constructions the semantics of SPARQL does not match the semantics for blank nodes recommended by the W3C in [10]. To see that this is the case, consider the RDF graphs G1 and G2 shown in Figure 1. In these graphs, :b1, :b2 and :b3 are blank nodes, which are used to represent objects that are owned by John and Peter. According to the semantics for blank nodes proposed by the W3C [10, 8], these two graphs are equivalent, as they can be mapped into each other. Thus, one would expect that the answer to any SPARQL query over G1 is the same as over G2. However, this is not the case for the following SPARQL query Q:
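Figure 1 and the query Q are not reproduced in this excerpt. The Python sketch below only illustrates the kind of mismatch described above. It uses two hypothetical graphs and a hypothetical one-pattern query, not the paper's G1, G2 and Q. The helper names (`maps_into`, `equivalent`, `answers`) are assumptions made for illustration. Graph equivalence is checked by brute force as "each graph can be mapped into the other by renaming blank nodes", and the query is evaluated the way SPARQL matches a single triple pattern, treating each blank node in the data as a distinct term.

```python
from itertools import product

def is_blank(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")

def maps_into(g, h):
    """True if some assignment of g's blank nodes to terms of h
    sends every triple of g to a triple of h."""
    blanks = sorted({t for triple in g for t in triple if is_blank(t)})
    terms = sorted({t for triple in h for t in triple})
    for image in product(terms, repeat=len(blanks)):
        mu = dict(zip(blanks, image))
        if all(tuple(mu.get(t, t) for t in triple) in h for triple in g):
            return True
    return False

def equivalent(g, h):
    # Two graphs are treated as equivalent if each maps into the other.
    return maps_into(g, h) and maps_into(h, g)

def answers(graph, pattern):
    """Match one triple pattern against the graph.
    Pattern terms starting with '?' are variables."""
    results = []
    for triple in graph:
        binding = {}
        matched = True
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A repeated variable must bind to the same term.
                if binding.setdefault(p, t) != t:
                    matched = False
                    break
            elif p != t:
                matched = False
                break
        if matched:
            results.append(binding)
    return results

# Hypothetical graphs (not the paper's Figure 1).
g_a = {("John", "owns", "_:b1")}
g_b = {("John", "owns", "_:b2"), ("John", "owns", "_:b3")}

# Hypothetical query: what does John own?
pattern = ("John", "owns", "?x")

print(equivalent(g_a, g_b))        # True
print(len(answers(g_a, pattern)))  # 1
print(len(answers(g_b, pattern)))  # 2
```

In this sketch both graphs say only that John owns something, yet the pattern yields one solution over the first graph and two over the second, because matching counts each blank node separately.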
Similar resources
Distributed query processing in the presence of blank nodes
This paper demonstrates that the presence of blank nodes in RDF data represents a problem for distributed processing of SPARQL queries. It is shown that the usual decomposition strategies from the literature will leak information—even when information derives from a single source. It is argued that this leakage, and the proper reparational measures, need to be accounted for in a formal semantic...
SPARQLog: SPARQL with Rules and Quantification
SPARQL has become the gold-standard for RDF query languages. Nevertheless, we believe there is further room for improving RDF query languages. In this chapter, we investigate the addition of rules and quantifier alternation to SPARQL. That extension, called SPARQLog, extends previous RDF query languages by arbitrary quantifier alternation: blank nodes may occur in the scope of all, some, or non...
RDFLog: It’s like Datalog for RDF
RDF data is set apart from relational or XML data by its support of rich existential information in the form of blank nodes. Where in SQL databases null values are scoped over a single tuple, blank nodes in RDF can span over any number of statements and thus can be seen as existentially quantified variables. Blank node querying is considered in most RDF query languages, but blank node construct...
Revisiting and Simplifying RDF
RDF, a simple yet expressive data model, is widely recognised as the foundation for the Semantic Web. We revisit the RDF specification and analyse five problems: literals are not addressable, blank nodes are not addressable, reification is not semantically recognised, language tags are not addressable, and language tags can not be combined. We introduce SiRDF, a simplified data model with fewer...
On the Semantics of SPARQL
The Resource Description Framework (RDF) is the standard data model for representing information about World Wide Web resources. Jointly with its release as Recommendation of the W3C, the natural problem of querying RDF data was raised. In the last years, the language SPARQL has become the standard query language for RDF and, in fact, a W3C Recommendation since January 2008. In this chapter, we...